Text Comparison Using Machine-Generated Nuggets

Author

  • Liang Zhou
Abstract

This paper describes a novel text comparison environment that facilitates text comparison by assessing and aggregating information nuggets that are automatically created and extracted from the texts in question. Our goal in designing such a tool is to enable and improve automatic nugget creation and to present its application to the evaluation of various natural language processing tasks. During our demonstration at HLT, new users will be able to experience first-hand that text analysis with system-created nuggets can be fun, enjoyable, and interesting.

Similar resources

A Semi-Automatic Evaluation Scheme: Automated Nuggetization for Manual Annotation

In this paper we describe automatic information nuggetization and its application to text comparison. More specifically, we take a close look at how machine-generated nuggets can be used to create evaluation material. A semiautomatic annotation scheme is designed to produce gold-standard data with exceptionally high inter-human agreement.

Identifying Nuggets of Information in GALE Distillation Evaluation

This paper describes an approach to automatic nuggetization and an implemented system employed in the GALE Distillation evaluation to measure the information content of text returned in response to an open-ended question. The system identifies nuggets, or atomic units of information, categorizes them according to their semantic type, and selects different types of nuggets depending on the type of the ...

Effective Structured Query Formulation for Session Search

In this work, we focus on formulating effective structured queries for session search. For a given query, phrase-like text nuggets are identified and formulated into Lemur queries that are fed into the Lemur search engine. Nuggets are substrings in qn, similar to phrases but not necessarily as semantically coherent as phrases. We assume that a valid nugget appears frequently in top returned snip...

Answer Extraction for Definition Questions using Information Gain and Machine Learning

Extracting nuggets (pieces of an answer) is a very important process in question answering systems, especially in the case of definition questions. Although there have been advances in nugget extraction, the problem remains finding general and flexible patterns that produce as many useful definition nuggets as possible. Nowadays, patterns are obtained manually or automatically and then these...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Publication date: 2007